Classification with unknown class-conditional label noise on non-compact feature spaces
We investigate the problem of classification in the presence of unknown
class-conditional label noise in which the labels observed by the learner have
been corrupted with some unknown class-dependent probability. In order to
obtain finite sample rates, previous approaches to classification with unknown
class-conditional label noise have required that the regression function be
close to its extrema on sets of large measure. We shall consider this problem
in the setting of non-compact metric spaces, where the regression function need
not attain its extrema.
In this setting we determine the minimax optimal learning rates (up to
logarithmic factors). The rate displays interesting threshold behaviour: When
the regression function approaches its extrema at a sufficient rate, the
optimal learning rates are of the same order as those obtained in the
label-noise free setting. If the regression function approaches its extrema
more gradually then classification performance necessarily degrades. In
addition, we present an adaptive algorithm which attains these rates without
prior knowledge of either the distributional parameters or the local density.
This identifies for the first time a scenario in which finite sample rates are
achievable in the label noise setting, but they differ from the optimal rates
without label noise.
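As a minimal illustration of the corruption model described in this abstract (not of the paper's adaptive algorithm), class-conditional label noise can be simulated by flipping each label with a probability that depends on its true class; the function name and noise levels below are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)

def corrupt_labels(y, rho_pos, rho_neg, rng):
    """Flip each binary label with a class-dependent probability:
    positives (y=1) flip with prob rho_pos, negatives (y=0) with rho_neg.
    The learner only ever observes the corrupted labels."""
    y = np.asarray(y)
    flip = np.where(y == 1,
                    rng.random(y.shape) < rho_pos,
                    rng.random(y.shape) < rho_neg)
    return np.where(flip, 1 - y, y)

# Clean labels, and the noisy view the learner actually receives.
y_clean = rng.integers(0, 2, size=10000)
y_noisy = corrupt_labels(y_clean, rho_pos=0.3, rho_neg=0.1, rng=rng)

# Empirical flip rates should track the class-conditional noise levels.
flip_pos = np.mean(y_noisy[y_clean == 1] != 1)
flip_neg = np.mean(y_noisy[y_clean == 0] != 0)
```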
Robust mixtures in the presence of measurement errors
We develop a mixture-based approach to robust density modeling and outlier
detection for experimental multivariate data that includes measurement error
information. Our model is designed to infer atypical measurements that are not
due to errors, aiming to retrieve potentially interesting peculiar objects.
Since exact inference is not possible in this model, we develop a
tree-structured variational EM solution. This compares favorably against a
fully factorial approximation scheme, approaching the accuracy of a
Markov-Chain-EM, while maintaining computational simplicity. We demonstrate the
benefits of including measurement errors in the model, in terms of improved
outlier detection rates in varying measurement uncertainty conditions. We then
use this approach in detecting peculiar quasars from an astrophysical survey,
given photometric measurements with errors.
Comment: (Refereed) Proceedings of the 24th Annual International Conference
on Machine Learning 2007 (ICML07), (Ed.) Z. Ghahramani. June 20-24, 2007,
Oregon State University, Corvallis, OR, USA, pp. 847-854; Omnipress. ISBN
978-1-59593-793-3; 8 pages, 6 figures
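The modelling idea (folding known per-point measurement-error variances into the mixture likelihood, rather than ignoring them) can be sketched with a toy one-dimensional EM. This is a deliberately simplified stand-in, not the paper's tree-structured variational scheme; the variance-correction heuristic and all names are illustrative:

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy data: two well-separated 1-D clusters, each observation carrying its
# own known measurement-error variance s2_i (heteroscedastic noise).
n = 400
z = rng.integers(0, 2, n)
mu_true = np.array([-3.0, 3.0])
s2 = rng.uniform(0.1, 0.5, n)                      # per-point error variances
x = mu_true[z] + rng.normal(0.0, 1.0, n) + rng.normal(0.0, np.sqrt(s2))

def em_mixture_with_errors(x, s2, n_iter=50):
    """EM for a 2-component Gaussian mixture whose effective variance for
    point i is (component variance + s2_i), so the measurement-error
    information enters the likelihood explicitly."""
    mu = np.array([x.min(), x.max()])              # crude initialisation
    var = np.ones(2)
    pi = np.array([0.5, 0.5])
    for _ in range(n_iter):
        # E-step: responsibilities under error-inflated variances.
        tot = var[None, :] + s2[:, None]           # shape (n, 2)
        logp = (np.log(pi)[None, :]
                - 0.5 * np.log(2 * np.pi * tot)
                - 0.5 * (x[:, None] - mu[None, :]) ** 2 / tot)
        r = np.exp(logp - logp.max(axis=1, keepdims=True))
        r /= r.sum(axis=1, keepdims=True)
        # M-step: weighted updates; the component variance subtracts the
        # average measurement-error contribution (simple moment correction).
        nk = r.sum(axis=0)
        mu = (r * x[:, None]).sum(axis=0) / nk
        var = (r * (x[:, None] - mu[None, :]) ** 2).sum(axis=0) / nk
        var = np.maximum(var - (r * s2[:, None]).sum(axis=0) / nk, 1e-3)
        pi = nk / len(x)
    return mu, var, pi

mu_hat, var_hat, pi_hat = em_mixture_with_errors(x, s2)
```

Points whose responsibility is low under every component (after accounting for their stated errors) are then the candidate "peculiar" objects, rather than points that merely have large error bars.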
Finding Young Stellar Populations in Elliptical Galaxies from Independent Components of Optical Spectra
Elliptical galaxies are believed to consist of a single population of old
stars formed together at an early epoch in the Universe, yet recent analyses of
galaxy spectra seem to indicate the presence of significant younger populations
of stars in them. The detailed physical modelling of such populations is
computationally expensive, inhibiting the detailed analysis of the several
million galaxy spectra becoming available over the next few years. Here we
present a data mining application aimed at decomposing the spectra of
elliptical galaxies into several coeval stellar populations, without the use of
detailed physical models. This is achieved by performing a linear independent
basis transformation that essentially decouples the initial problem of joint
processing of a set of correlated spectral measurements into that of the
independent processing of a small set of prototypical spectra. Two methods are
investigated: (1) a fast projection approach, derived by exploiting the
correlation structure of neighboring wavelength bins within the spectral data;
and (2) a factorisation method that takes advantage of the positivity of the
spectra. The preliminary results show that typical
features observed in stellar population spectra of different evolutionary
histories can be convincingly disentangled by these methods, despite the
absence of input physics. The success of this basis transformation analysis in
recovering physically interpretable representations indicates that this
technique is a potentially powerful tool for astronomical data mining.Comment: 12 Pages, 7 figures; accepted in SIAM 2005 International Conference
on Data Mining, Newport Beach, CA, April 200
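The positivity-exploiting factorisation route (method 2) can be illustrated with plain multiplicative-update non-negative matrix factorisation on synthetic non-negative "spectra". This is a generic sketch of the idea, not the specific factorisation used in the paper, and the dimensions are arbitrary:

```python
import numpy as np

rng = np.random.default_rng(2)

# Synthetic stand-in for the galaxy-spectra setting: each observed spectrum
# (a row of V) is a non-negative mixture of a few prototypical components.
n_spectra, n_bins, k = 50, 200, 3
W_true = rng.random((n_spectra, k))
H_true = rng.random((k, n_bins))
V = W_true @ H_true

def nmf(V, k, n_iter=500, eps=1e-9):
    """Multiplicative-update NMF (Lee-Seung style): factorise V ~ W @ H
    with W, H kept elementwise non-negative throughout."""
    n, m = V.shape
    W = rng.random((n, k)) + eps
    H = rng.random((k, m)) + eps
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

W, H = nmf(V, k)
rel_err = np.linalg.norm(V - W @ H) / np.linalg.norm(V)
```

The rows of H play the role of the prototypical spectra, and each observed spectrum is explained by its non-negative mixing weights in W, mirroring the decomposition into coeval stellar populations.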
Dimension-free error bounds from random projections
Learning from high dimensional data is challenging in general; however, the data is often not truly high dimensional, in the sense that it may have some hidden low-complexity geometry. We give new, user-friendly PAC-bounds that are able to take advantage of such benign geometry to reduce the dimension-dependence of error guarantees in settings where such dependence is known to be essential in general. This is achieved by employing random projection as an analytic tool, and exploiting its structure-preserving compression ability. We introduce an auxiliary function class that operates on reduced-dimensional inputs, together with a new complexity term, defined as the distortion of the loss under random projections. The latter is a hypothesis-dependent data-complexity, whose analytic estimates turn out to recover various regularisation schemes in parametric models, and a notion of intrinsic dimension, as quantified by the Gaussian width of the input support, in the case of the nearest neighbour rule. If benign geometry is present, the bounds become tighter; otherwise they recover the original dimension-dependent bounds.
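The structure-preserving compression that drives these bounds can be illustrated numerically: project data lying on a low-dimensional subspace of a high-dimensional ambient space with a scaled Gaussian matrix, and check that pairwise distances are only mildly distorted. This is a generic Johnson-Lindenstrauss-style sketch, not the paper's bound machinery; all dimensions are arbitrary:

```python
import numpy as np

rng = np.random.default_rng(3)

# Benign geometry: points lie on a random 5-dimensional subspace of R^1000,
# so the data is not "truly" high dimensional despite the ambient dimension.
n, D, d, k = 100, 1000, 5, 100
X = rng.normal(size=(n, d)) @ rng.normal(size=(d, D))

# Gaussian random projection to k dimensions, scaled so that squared norms
# are preserved in expectation.
R = rng.normal(size=(D, k)) / np.sqrt(k)
Xp = X @ R

def pairwise_sq_dists(A):
    """Squared Euclidean distances between all rows of A."""
    sq = (A ** 2).sum(axis=1)
    return sq[:, None] + sq[None, :] - 2.0 * (A @ A.T)

# Distortion of pairwise (squared) distances under the projection.
mask = ~np.eye(n, dtype=bool)
ratios = pairwise_sq_dists(Xp)[mask] / pairwise_sq_dists(X)[mask]
max_distortion = np.abs(ratios - 1.0).max()
```

Because the points occupy only a 5-dimensional subspace, all pairwise distances survive the 1000-to-100 compression with modest distortion; for data that genuinely fills the ambient space, the same target dimension would distort far more, which is the mechanism behind the dimension-dependent fallback in the bounds.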